Convergence Rates of Latent Topic Models Under Relaxed Identifiability Conditions

Author

  • Yining Wang
Abstract

In this paper we study the frequentist convergence rate of maximum likelihood estimation for Latent Dirichlet Allocation (Blei et al., 2003) topic models. We show that the maximum likelihood estimator converges to one of the finitely many equivalent parameters in the Wasserstein distance at a rate of $n^{-1/4}$, without assuming separability or non-degeneracy of the underlying topics or the existence of more than three words per document, thus generalizing the previous works of Anandkumar et al. (2012, 2014) from an information-theoretic perspective. We also show that the $n^{-1/4}$ convergence rate is optimal in the worst case.
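For concreteness, here is a minimal sketch of the objects in this statement, with notation assumed for illustration rather than taken from the paper. The $K$ topics $\beta_1, \dots, \beta_K$ (points in the vocabulary simplex) and their weights $w_1, \dots, w_K$ can be collected into a discrete mixing measure, and estimation error is the Wasserstein distance between the estimated and true measures, minimized over the finite set of equivalent parameterizations (e.g., topic relabelings):

\[
G \;=\; \sum_{k=1}^{K} w_k\, \delta_{\beta_k}, \qquad
W_1(G, G') \;=\; \min_{q \in \Pi(w, w')} \sum_{k,\ell} q_{k\ell}\, \lVert \beta_k - \beta'_\ell \rVert_1 ,
\]

where $\Pi(w, w')$ is the set of couplings of the two weight vectors. The main result then reads, schematically,

\[
\min_{G^\ast \in \mathcal{E}(G_0)} W_1\big(\widehat{G}_n, G^\ast\big) \;=\; O_P\big(n^{-1/4}\big),
\]

with $\widehat{G}_n$ the maximum likelihood estimator from $n$ documents and $\mathcal{E}(G_0)$ the finite equivalence class of the true parameter $G_0$.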


Similar articles

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity

Overcomplete latent representations have been very popular for unsupervised feature learning in recent years. In this paper, we specify which overcomplete models can be identified given observable moments of a certain order. We consider probabilistic admixture or topic models in the overcomplete regime, where the number of latent topics can greatly exceed the size of the observed word vocabular...


Convergence of Latent Mixing Measures in Finite and Infinite Mixture Models

This paper studies convergence behavior of latent mixing measures that arise in finite and infinite mixture models, using transportation distances (i.e., Wasserstein metrics). The relationship between Wasserstein distances on the space of mixing measures and f-divergence functionals such as Hellinger and Kullback–Leibler distances on the space of mixture distributions is investigated in detail...
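As a minimal sketch of the kind of relationship these papers establish (the notation and constants here are illustrative assumptions, not quotations): if the component densities $f(\cdot \mid \theta)$ are $L$-Lipschitz in $\theta$ with respect to total variation, a coupling argument bounds divergences between mixture densities by Wasserstein distances between mixing measures:

\[
p_G(x) \;=\; \int f(x \mid \theta)\, dG(\theta), \qquad
V\big(p_G, p_{G'}\big) \;\le\; L \cdot W_1(G, G').
\]

Inequalities in the opposite direction, which are what drive convergence rates for the mixing measure, require identifiability conditions and typically lose a power; schematically, $W_2(G, G_0)^2 \lesssim V\big(p_G, p_{G_0}\big)$ locally under second-order identifiability.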


Convergence of latent mixing measures in finite and infinite mixture models

We consider Wasserstein distances for assessing the convergence of latent discrete measures, which serve as mixing distributions in hierarchical and nonparametric mixture models. We clarify the relationships between Wasserstein distances of mixing distributions and f-divergence functionals such as Hellinger and Kullback–Leibler distances on the space of mixture distributions using various iden...


Convergence of latent mixing measures in nonparametric and mixture models

We consider Wasserstein distance functionals for assessing the convergence of latent discrete measures, which serve as mixing distributions in hierarchical and nonparametric mixture models. We clarify the relationships between Wasserstein distances of mixing distributions and f-divergence functionals such as Hellinger and Kullback–Leibler distances on the space of mixture distributions using v...


When are Overcomplete Representations Identifiable? Uniqueness of Tensor Decompositions Under Expansion Constraints

Overcomplete latent representations have been very popular for unsupervised feature learning in recent years. In this paper, we specify which overcomplete models can be identified given observable moments of a certain order. We consider probabilistic admixture or topic models in the overcomplete regime. While general overcomplete admixtures are not identifiable, we establish generic ident...
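To sketch why tensor decompositions govern identifiability here (a standard construction, assumed for illustration rather than quoted from the paper): in a single-topic admixture with topic vectors $\mu_k \in \mathbb{R}^d$ and weights $w_k$, the population third-order moment of three exchangeable words $x_1, x_2, x_3$ factorizes as

\[
M_3 \;=\; \mathbb{E}\big[x_1 \otimes x_2 \otimes x_3\big] \;=\; \sum_{k=1}^{K} w_k\, \mu_k \otimes \mu_k \otimes \mu_k ,
\]

and the parameters are identifiable from $M_3$ exactly when this CP decomposition is unique. Kruskal-type conditions give uniqueness for generic $\mu_k$ when $K$ is moderate relative to $d$; the overcomplete regime $K \gg d$ is where structured-sparsity or expansion conditions of the kind studied in these papers become necessary.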



Journal:
  • CoRR

Volume: abs/1710.11070  Issue: –

Pages: –

Publication date: 2017